NHS-R Community Workshop
2024-02-22
R code
01_initial.R
“An encapsulation of a task”?
The function, in a way, doesn’t need to care about what’s under the mask.
Whatever it is, just (try to) do this to it.
Fenella’s job is to check that the parcel has the right label on it. If it does, she will allow it into the warehouse (function).
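Fenella's label check can be sketched in R as a guard at the top of a function body. This is only an illustration: the function name `double_it()` and the error message are hypothetical, not taken from the workshop's scripts.

```r
# Hypothetical example: check the "label" (the type of the input) before
# letting the "parcel" (the value) into the function body.
double_it <- function(x) {
  # Stop early with a clear message if x is not numeric
  if (!is.numeric(x)) {
    stop("`x` must be numeric, not ", class(x)[1], ".")
  }
  x * 2
}

double_it(21)
double_it(c(1, 2, 3))
# double_it("21") would stop with an error, because the label is wrong
```

Checking at the door means the rest of the function can assume it has been handed something it knows how to work with.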
R code
02_arguments.R
If you have code that looks like this:
“If you find yourself writing the same thing more than twice, turn it into a function.”
To which I would add,
“If you find yourself using a function in more than one project, add it to a package.”
# A tibble: 87 × 14
   name     height  mass hair_color skin_color eye_color birth_year sex   gender
   <chr>     <int> <dbl> <chr>      <chr>      <chr>          <dbl> <chr> <chr>
 1 Luke Sk…    172    77 blond      fair       blue            19   male  mascu…
 2 C-3PO       167    75 <NA>       gold       yellow         112   none  mascu…
 3 R2-D2        96    32 <NA>       white, bl… red             33   none  mascu…
 4 Darth V…    202   136 none       white      yellow          41.9 male  mascu…
 5 Leia Or…    150    49 brown      light      brown           19   fema… femin…
 6 Owen La…    178   120 brown, gr… light      blue            52   male  mascu…
 7 Beru Wh…    165    75 brown      light      blue            47   fema… femin…
 8 R5-D4        97    32 <NA>       white, red red             NA   none  mascu…
 9 Biggs D…    183    84 black      light      brown           24   male  mascu…
10 Obi-Wan…    182    77 auburn, w… fair       blue-gray       57   male  mascu…
# ℹ 77 more rows
# ℹ 5 more variables: homeworld <chr>, species <chr>, films <list>,
#   vehicles <list>, starships <list>
droids_mean_height <- starwars |>
  filter(species == "Droid") |>
  summarise(droid_mean_height = mean(height, na.rm = TRUE))
droids_mean_height
# A tibble: 1 × 1
  droid_mean_height
              <dbl>
1              131.
ewoks_mean_height <- starwars |>
  filter(species == "Ewok") |>
  summarise(ewok_mean_height = mean(height, na.rm = TRUE))
ewoks_mean_height
# A tibble: 1 × 1
  ewok_mean_height
             <dbl>
1               88
First steps: name it and wrap it
Note that we don’t need to worry about what the output will be called (if anything). The user of the function will decide that when they run it.
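As a rough sketch of "name it and wrap it", the two near-identical droid and Ewok blocks could be folded into one function. The function name `species_mean_height()`, its arguments `data` and `species_name`, and the output column name `mean_height` are assumptions made for illustration; the workshop's own refactoring may look different.

```r
library(dplyr)

# Hypothetical name and arguments, chosen for illustration.
# `data` is a data frame with `species` and `height` columns;
# `species_name` is the species to keep, as a string.
species_mean_height <- function(data, species_name) {
  data |>
    filter(species == species_name) |>
    summarise(mean_height = mean(height, na.rm = TRUE))
}

# The caller chooses what the result is called, not the function
droids_mean_height <- species_mean_height(starwars, "Droid")
ewoks_mean_height <- species_mean_height(starwars, "Ewok")
```

The only thing that differed between the two original blocks was the species string, so that is what becomes an argument. Passing the data frame in as well means the function does not depend on a particular object existing in the global environment.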
R code
03_refactoring.R
lapply() or purrr::map(), if useful.
I remember it taking me a while, when I first tried to understand functions, for them to ‘click’ for me.
I didn’t understand the difference between variables inside the function and outside it. I didn’t really get what the point was.
I didn’t realise the power of the abstraction of using variables (as function arguments), instead of just passing in values.
Eventually I realised that this means your code can work harder for you. It can do its thing in different situations, and work with different inputs.
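The difference between variables inside and outside a function can be seen in a small base R sketch. The names `height`, `add_heels()` and `heel` are made up for this example.

```r
# A variable that lives outside the function
height <- 100

# Hypothetical function: `height` here is an argument, local to the function
add_heels <- function(height, heel = 5) {
  height <- height + heel  # changes only the local copy
  height
}

add_heels(180)  # returns 185
height          # still 100: the outside variable is untouched
```

Because the function works on whatever is passed in as `height`, the same code can be reused with any value, which is the abstraction described above.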
To quote Hadley Wickham & co.:
R for Data Science, Functions section
- You can give a function an evocative name that makes your code easier to understand.
- As requirements change, you only need to update code in one place, instead of many.
- You eliminate the chance of making incidental mistakes when you copy and paste (i.e. updating a variable name in one place, but not in another).
- It makes it easier to reuse work from project-to-project, increasing your productivity over time.
purrr::map()
dplyr::mutate(), for example
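One way to read the mention of these functions is that a function you have written yourself can be handed to other functions. The sketch below assumes a hypothetical helper, `cm_to_m()`, and uses base `lapply()` in place of `purrr::map()` to avoid an extra dependency.

```r
library(dplyr)

# Hypothetical helper: convert a height in centimetres to metres
cm_to_m <- function(height_cm) {
  height_cm / 100
}

# Inside dplyr::mutate(), the function is applied to the whole column at once
heights <- starwars |>
  mutate(height_m = cm_to_m(height)) |>
  select(name, height, height_m)

# With lapply(), the function is called once per list element
height_list <- lapply(list(172, 167, 96), cm_to_m)
```

`purrr::map(list(172, 167, 96), cm_to_m)` would be called the same way as the `lapply()` line and also returns a list.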